Responsibilities:
- Build infrastructure and automation for the extraction, preparation, and loading of data from various sources;
- Create unit and stress test components to monitor technical performance and ensure identified issues are resolved;
- Build and maintain analytical tools to provide data insight and capture key metrics;
- Automate and integrate new components into the data pipeline;
- Utilize best practices for data governance, quality, cleansing, and other ETL-related activities;
- Maintain technical documentation.
Requirements:
- 3+ years of development experience in data engineering;
- 1+ years of professional experience working in big data ecosystems, preference for Spark;
- 1+ years of professional experience working with data flow management tools, such as Airflow;
- 1+ years of experience working with Pentaho (or equivalent tools such as Talend, DataStage, or Informatica);
- Hands-on scripting experience with Python, Scala, and/or shell scripting;
- Preference for development experience in highly scalable, distributed systems and cluster architectures (e.g. AWS, Azure, Google Cloud);
- Familiarity with complex NoSQL databases (e.g. DynamoDB, Cassandra, Elasticsearch);
- Prior experience working with large data sets (1M+ records);
- B.S. preferred in Computer Science, Information Systems, or related fields (foreign education equivalent accepted).
Location
Lisbon - Portugal
Job type
Full-Time
Python Job Details
As a Data Engineer, you will develop, optimize, and maintain the ETL data pipeline. This involves working with infrastructure built in AWS, including Spark EMR, S3, and DynamoDB. You will also help build analytical tools, develop unit and stress tests, and create automation around the orchestration of the ETL data pipeline.
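The day-to-day shape of such a pipeline can be sketched in plain Python. This is a minimal, illustrative extract/transform/load skeleton under stated assumptions: the function names, record layout, and in-memory source/sink are hypothetical and not from the posting; in the role itself the extract and load stages would run as Spark EMR jobs against S3 and DynamoDB.

```python
# Minimal, illustrative ETL skeleton. The stages and record layout are
# hypothetical; in production these would be Spark jobs reading from S3
# and writing to DynamoDB.

def extract(source):
    """Pull raw records from a source (here, an in-memory stand-in)."""
    return list(source)

def transform(records):
    """Cleanse and normalize records, dropping any that lack an id."""
    return [
        {"id": r["id"], "value": float(r.get("value", 0))}
        for r in records
        if r.get("id") is not None
    ]

def load(records, sink):
    """Write transformed records to a sink (here, a dict keyed by id)."""
    for r in records:
        sink[r["id"]] = r["value"]
    return sink

raw = [{"id": 1, "value": "3.5"}, {"id": None, "value": "9"}, {"id": 2}]
sink = load(transform(extract(raw)), {})
# sink == {1: 3.5, 2: 0.0}
```

Keeping each stage a small pure function like this is what makes the unit and stress testing mentioned above practical: every stage can be exercised in isolation before it is wired into the orchestrated pipeline.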
Pay: 3,000.00€ - 7,000.00€ per month